    Recent Developments in Deep Learning Applied to Protein Structure Prediction

    Although many structural bioinformatics tools have been using neural network models for a long time, deep neural network (DNN) models have attracted considerable interest in recent years. Methods employing DNNs have had a significant impact in recent CASP experiments, notably in CASP12 and especially CASP13. In this article, we offer a brief introduction to some of the key principles and properties of DNN models and discuss why they are naturally suited to certain problems in structural bioinformatics. We also briefly discuss methodological improvements that have enabled these successes. Using the contact prediction task as an example, we also speculate why DNN models are able to produce reasonably accurate predictions even in the absence of many homologues for a given target sequence, a result which can at first glance appear surprising given the lack of input information. We end on some thoughts about how and why these types of models can be so effective, as well as a discussion on potential pitfalls.

    Adaptive HIV-1 evolutionary trajectories are constrained by protein stability

    Despite the use of combination antiretroviral drugs for the treatment of HIV-1 infection, the emergence of drug resistance remains a problem. Resistance may be conferred by either a single mutation or a concerted set of mutations. The involvement of multiple mutations can arise due to interactions between sites in the amino acid sequence as a consequence of the need to maintain protein structure. To better understand the nature of such epistatic interactions, we reconstructed the ancestral sequences of HIV-1’s Pol protein, and traced the evolutionary trajectories leading to mutations associated with drug resistance. Using contemporary and ancestral sequences, we modelled the effects of mutations (i.e. amino acid replacements) on protein structure to understand the functional effects of residue changes. Although the majority of resistance-associated sequences tend to destabilise the protein structure, we find there is a general tendency for protein stability to decrease across HIV-1’s evolutionary history. That a similar pattern is observed in the non-drug resistance lineages indicates that non-resistance mutations, for example those associated with escape from the immune response, also affect protein stability. Maintenance of optimal protein structure therefore represents a major constraint on the evolution of HIV-1.

    A guide to machine learning for biologists

    The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.

    Ultrafast end-to-end protein structure prediction enables high-throughput exploration of uncharacterized proteins

    Deep learning-based prediction of protein structure usually begins by constructing a multiple sequence alignment (MSA) containing homologs of the target protein. The most successful approaches combine large feature sets derived from MSAs, and considerable computational effort is spent deriving these input features. We present a method that greatly reduces the amount of preprocessing required for a target MSA, while producing main chain coordinates as a direct output of a deep neural network. The network makes use of just three recurrent networks and a stack of residual convolutional layers, making the predictor very fast to run, and easy to install and use. Our approach constructs a directly learned representation of the sequences in an MSA, starting from a one-hot encoding of the sequences. When supplemented with an approximate precision matrix, the learned representation can be used to produce structural models of accuracy comparable to or greater than that of our original DMPfold method, while requiring less than a second to produce a typical model. This level of accuracy and speed allows very large-scale three-dimensional modeling of proteins on minimal hardware, and we demonstrate this by producing models for over 1.3 million uncharacterized regions of proteins extracted from the BFD sequence clusters. After constructing an initial set of approximate models, we select a confident subset of over 30,000 models for further refinement and analysis, revealing putative novel protein folds. We also provide updated models for over 5,000 Pfam families studied in the original DMPfold paper.
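
    As a rough illustration of the "one-hot encoding of the sequences" mentioned in this abstract, the sketch below encodes a toy MSA in plain Python. It is not the authors' code: the 21-symbol alphabet, the function name `one_hot_msa`, the handling of unknown residues, and the toy alignment are all assumptions made for the example.

```python
# Assumed alphabet: the 20 standard amino acids plus a gap symbol.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_msa(msa):
    """Encode aligned sequences as nested lists of shape
    (n_sequences, alignment_length, len(ALPHABET))."""
    if not msa:
        return []
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all sequences in an MSA must have the same length")
    encoded = []
    for seq in msa:
        rows = []
        for ch in seq.upper():
            vec = [0] * len(ALPHABET)
            # Assumption: symbols outside the alphabet (e.g. X, B, Z)
            # are folded into the gap channel.
            vec[INDEX.get(ch, INDEX["-"])] = 1
            rows.append(vec)
        encoded.append(rows)
    return encoded


# Hypothetical toy alignment: three sequences, four columns.
toy_msa = ["MKV-", "MRVA", "MKIA"]
encoded = one_hot_msa(toy_msa)
```

    Each alignment column becomes a vector with a single 1, so a network can learn its own representation from the raw alignment rather than from precomputed features.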

    Reliable Generation of Native-Like Decoys Limits Predictive Ability in Fragment-Based Protein Structure Prediction

    Our previous work with fragment-assembly methods has demonstrated specific deficiencies in conformational sampling behaviour that, when addressed through improved sampling algorithms, can lead to more reliable prediction of tertiary protein structure when good fragments are available, and when score values can be relied upon to guide the search to the native basin. In this paper, we present preliminary investigations into two important questions arising from more difficult prediction problems. First, we investigated the extent to which native-like conformational states are generated during multiple runs of our search protocols. We determined that, in cases of difficult prediction, native-like decoys are rarely or never generated. Second, we developed a scheme for decoy retention that balances the objectives of retaining low-scoring structures and retaining conformationally diverse structures sampled during the course of the search. Our method succeeds at retaining more diverse sets of structures, and, for a few targets, more native-like solutions are retained as compared to our original, energy-based retention scheme. However, in general, we found that the rate at which native-like structural states are generated has a much stronger effect on eventual distributions of predictive accuracy in the decoy sets, as compared to the specific decoy retention strategy used. We found that our protocols show differences in their ability to access native-like states for some targets, and this may explain some of the differences in predictive performance seen between these methods. There appears to be an interaction between fragment sets and move operators, which influences the accessibility of native-like structures for given targets. Our results point to clear directions for further improvements in fragment-based methods, which are likely to enable higher accuracy predictions.

    Improved fragment-based protein structure prediction by redesign of search heuristics

    Difficulty in sampling large and complex conformational spaces remains a key limitation in fragment-based de novo prediction of protein structure. Our previous work has shown that even for small-to-medium-sized proteins, some current methods inadequately sample alternative structures. We have developed two new conformational sampling techniques, one employing a bilevel optimisation framework and the other employing iterated local search. We combine strategies of forced structural perturbation (where some fragment insertions are accepted regardless of their impact on scores) and greedy local optimisation, allowing greater exploration of the available conformational space. Comparisons against the Rosetta Abinitio method indicate that our protocols more frequently generate native-like predictions for many targets, even following the low-resolution phase, using a given set of fragment libraries. By contrasting results across two different fragment sets, we show that our methods are able to better take advantage of high-quality fragments. These improvements can also translate into more reliable identification of near-native structures in a simple clustering-based model selection procedure. We show that when fragment libraries are sufficiently well-constructed, improved breadth of exploration within runs improves prediction accuracy. Our results also suggest that in benchmarking scenarios, a total exclusion of fragments drawn from homologous templates can make performance differences between methods appear less pronounced.

    Uncommon mutational profiles of metastatic colorectal cancer detected during routine genotyping using next generation sequencing

    RAS genotyping is mandatory to predict resistance to anti-EGFR monoclonal antibody (mAb) therapy, and BRAF genotyping is a relevant prognostic marker in patients with metastatic colorectal cancer (mCRC). Although the role of hotspot mutations is well defined, the impact of uncommon mutations is still unknown. In this study, we aimed to discuss the potential utility of detecting uncommon RAS and BRAF mutation profiles with next-generation sequencing (NGS). A total of 779 FFPE samples from patients with metastatic colorectal cancer with valid NGS results were screened and 22 uncommon mutational profiles of the KRAS, NRAS and BRAF genes were selected. The impact of these mutations was then assessed in silico using two predictive scores and structural protein modelling. Three samples carried a single KRAS non-hotspot mutation, one a single NRAS non-hotspot mutation, four a single BRAF non-hotspot mutation and fourteen carried several mutations. This in silico study shows that some non-hotspot RAS mutations seem to behave like hotspot mutations and warrant further examination to assess whether they confer resistance to anti-EGFR mAb therapy in patients bearing them. For the BRAF gene, non-V600E mutations may characterise a novel subtype of mCRC with better prognosis, potentially implying a modification of therapeutic strategy.

    A Flavonoid, Luteolin, Cripples HIV-1 by Abrogation of Tat Function

    Despite the effectiveness of combination antiretroviral treatment (cART) against HIV-1, evidence indicates that residual infection persists in different cell types. Intensification of cART does not decrease the residual viral load or immune activation. cART restricts the synthesis of infectious virus but does not curtail HIV-1 transcription and translation from either the integrated or unintegrated viral genomes in infected cells. All treated patients with full viral suppression actually have low-level viremia. More than 60% of treated individuals also develop minor HIV-1-associated neurocognitive deficits (HAND) due to residual virus and immune activation. Thus, new therapeutic agents are needed to curtail HIV-1 transcription and residual virus. In this study, luteolin, a dietary supplement, profoundly reduced HIV-1 infection in reporter cells and primary lymphocytes. HIV-1 inhibition by luteolin was independent of viral entry, as shown by the fact that wild-type and VSV-pseudotyped HIV-1 infections were similarly inhibited. Luteolin was unable to inhibit viral reverse transcription. Luteolin had antiviral activity in a latent HIV-1 reactivation model and effectively ablated both clade B and clade C Tat-driven LTR transactivation in reporter assays but had no effect on Tat expression or its sub-cellular localization. We conclude that luteolin confers anti-HIV-1 activity at the Tat functional level. Given its biosafety profile and ability to cross the blood-brain barrier, luteolin may serve as a base flavonoid to develop potent anti-HIV-1 derivatives to complement cART.

    The importance of nerve microenvironment for schwannoma development

    Schwannomas are predominantly benign nerve sheath neoplasms caused by Nf2 gene inactivation. Presently, treatment options are mainly limited to surgical tumor resection due to the lack of effective pharmacological drugs. Although the mechanistic understanding of Nf2 gene function has advanced, it has so far been primarily restricted to Schwann cell-intrinsic events. Extracellular cues determining Schwann cell behavior with regard to schwannoma development remain unknown. Here we show pro-tumourigenic microenvironmental effects on Schwann cells, where an altered axonal microenvironment in cooperation with injury signals contributes to a persistent regenerative Schwann cell response promoting schwannoma development. Specifically, in genetically engineered mice following crush injuries on sciatic nerves, we found macroscopic nerve swellings in mice with homozygous Nf2 gene deletion in Schwann cells and in animals with heterozygous Nf2 knockout in both Schwann cells and axons. However, patient-mimicking schwannomas could only be provoked in animals with combined heterozygous Nf2 knockout in Schwann cells and axons. We identified a severe re-myelination defect and sustained macrophage presence in the tumor tissue as major abnormalities. Strikingly, treatment of tumor-developing mice after nerve crush injury with medium-dose aspirin significantly decreased schwannoma progression in this disease model. Our results suggest a multifactorial concept for schwannoma formation, emphasizing axonal factors and mechanical nerve irritation as predilection sites for schwannoma development. Furthermore, we provide evidence supporting the potential efficacy of anti-inflammatory drugs in the treatment of schwannomas.